gbt-HIPS: Explaining the Classifications of Gradient Boosted Tree Ensembles

نویسندگان

چکیده

This research presents Gradient Boosted Tree High Importance Path Snippets (gbt-HIPS), a novel, heuristic method for explaining gradient boosted tree (GBT) classification models by extracting single rule (CR) from the ensemble of decision trees that make up GBT model. CR contains most statistically important boundary values input space as antecedent terms. The represents hyper-rectangle inside which model is, very reliably, classifying all instances with same class label explanandum instance. In benchmark test using nine data sets and five competing state-of-the-art methods, gbt-HIPS offered best trade-off between coverage (0.16–0.75) precision (0.85–0.98). Unlike is also demonstrably guarded against under- over-fitting. A further distinguishing feature our that, unlike much prior work, explanations provide counterfactual detail in accordance widely accepted recommendations what makes good explanation.

برای دانلود باید عضویت طلایی داشته باشید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Explaining Naïve Bayes Classifications

Naïve Bayes classifiers, a popular tool for predicting the labels of query instances, are typically learned from a training set. However, since many training sets contain noisy data, a classifier user may be reluctant to blindly trust a predicted label. We present a novel graphical explanation facility for Naïve Bayes classifiers that serves three purposes. First, it transparently explains the ...

متن کامل

Explaining Data-Driven Document Classifications

Many document classification applications require human understanding of the reasons for data-driven classification decisions: by managers, client-facing employees, and the technical team. Predictive models treat documents as data to be classified, and document data are characterized by very high dimensionality, often with tens of thousands to millions of variables (words). Unfortunately, due t...

متن کامل

Optimization of Tree Ensembles

Tree ensemble models such as random forests and boosted trees are among the most widely used and practically successful predictive models in applied machine learning and business analytics. Although such models have been used to make predictions based on exogenous, uncontrollable independent variables, they are increasingly being used to make predictions where the independent variables are cont...

متن کامل

Tree-Structured Boosting: Connections Between Gradient Boosted Stumps and Full Decision Trees

José Marcio Luna 1 Eric Eaton 2 Lyle H. Ungar 2 Eric Diffenderfer 1 Shane T. Jensen 3 Efstathios D. Gennatas 4 Mateo Wirth 3 Charles B. Simone II 5 Timothy D. Solberg 4 Gilmer Valdes 4 1 Dept. of Radiation Oncology, University of Pennsylvania {Jose.Luna,Eric.Diffenderfer}@uphs.upenn.edu 2 Dept. of Computer and Information Science, University of Pennsylvania {eeaton,ungar}@cis.upenn.edu 3 Dept. ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

ژورنال

عنوان ژورنال: Applied sciences

سال: 2021

ISSN: ['2076-3417']

DOI: https://doi.org/10.3390/app11062511